.TH std::codecvt 3 "2024.06.10" "http://cppreference.com" "C++ Standard Libary"
.SH NAME
std::codecvt \- std::codecvt

.SH Synopsis
   Defined in header <locale>
   template<

       class InternT,
       class ExternT,
       class StateT

   > class codecvt;

   Class template std::codecvt encapsulates conversion of character strings, including
   wide and multibyte, from one encoding to another. All file I/O operations performed
   through std::basic_fstream<CharT> use the std::codecvt<CharT, char, std::mbstate_t>
   facet of the locale imbued in the stream.

   std-codecvt-inheritance.svg

                                   Inheritance diagram

.SH Specializations

   The standard library is guaranteed to provide the following specializations (they
   are required to be implemented by any locale object):

   Defined in header <locale>
   std::codecvt<char, char, std::mbstate_t>  identity conversion
   std::codecvt<char16_t, char,
   std::mbstate_t>                           conversion between UTF-16 and UTF-8
   \fI(since C++11)\fP(deprecated in C++20)
   std::codecvt<char16_t, char8_t,
   std::mbstate_t>                           conversion between UTF-16 and UTF-8
   \fI(since C++20)\fP
   std::codecvt<char32_t, char,
   std::mbstate_t>                           conversion between UTF-32 and UTF-8
   \fI(since C++11)\fP(deprecated in C++20)
   std::codecvt<char32_t, char8_t,
   std::mbstate_t>                           conversion between UTF-32 and UTF-8
   \fI(since C++20)\fP
   std::codecvt<wchar_t, char,               conversion between the system's native
   std::mbstate_t>                           wide and the single-byte narrow character
                                             sets

.SH Member types

   Member name Definition
   intern_type InternT
   extern_type ExternT
   state_type  StateT

.SH Member functions

   constructor   constructs a new codecvt facet
                 \fI(public member function)\fP
   out           invokes do_out
                 \fI(public member function)\fP
   in            invokes do_in
                 \fI(public member function)\fP
   unshift       invokes do_unshift
                 \fI(public member function)\fP
   encoding      invokes do_encoding
                 \fI(public member function)\fP
   always_noconv invokes do_always_noconv
                 \fI(public member function)\fP
   length        invokes do_length
                 \fI(public member function)\fP
   max_length    invokes do_max_length
                 \fI(public member function)\fP

.SH Member objects

   static std::locale::id id id of the locale
                             \fI(public member object)\fP

.SH Protected member functions

   destructor       destructs a codecvt facet
                    \fI(protected member function)\fP
   do_out           converts a string from InternT to ExternT, such as when writing to
   \fB[virtual]\fP        file
                    \fI(virtual protected member function)\fP
   do_in            converts a string from ExternT to InternT, such as when reading
   \fB[virtual]\fP        from file
                    \fI(virtual protected member function)\fP
   do_unshift       generates the termination character sequence of ExternT characters
   \fB[virtual]\fP        for incomplete conversion
                    \fI(virtual protected member function)\fP
   do_encoding      returns the number of ExternT characters necessary to produce one
   \fB[virtual]\fP        InternT character, if constant
                    \fI(virtual protected member function)\fP
   do_always_noconv tests if the facet encodes an identity conversion for all valid
   \fB[virtual]\fP        argument values
                    \fI(virtual protected member function)\fP
   do_length        calculates the length of the ExternT string that would be consumed
   \fB[virtual]\fP        by conversion into given InternT buffer
                    \fI(virtual protected member function)\fP
   do_max_length    returns the maximum number of ExternT characters that could be
   \fB[virtual]\fP        converted into a single InternT character
                    \fI(virtual protected member function)\fP

Inherited from std::codecvt_base

   Member type                                 Definition
   enum result { ok, partial, error, noconv }; Unscoped enumeration type

   Enumeration constant Definition
   ok                   conversion was completed with no error
   partial              not all source characters were converted
   error                encountered an invalid character
   noconv               no conversion required, input and output types are the same

.SH Example

   The following examples reads a UTF-8 file using a locale which implements UTF-8
   conversion in codecvt<wchar_t, char, std::mbstate_t> and converts a UTF-8 string to
   UTF-16 using one of the standard specializations of std::codecvt.


// Run this code

 #include <codecvt>
 #include <cstdint>
 #include <fstream>
 #include <iomanip>
 #include <iostream>
 #include <locale>
 #include <string>

 // utility wrapper to adapt locale-bound facets for wstring/wbuffer convert
 template<class Facet>
 struct deletable_facet : Facet
 {
     template<class... Args>
     deletable_facet(Args&&... args) : Facet(std::forward<Args>(args)...) {}
     ~deletable_facet() {}
 };

 int main()
 {
     // UTF-8 narrow multibyte encoding
     std::string data = reinterpret_cast<const char*>(+u8"z\\u00df\\u6c34\\U0001f34c");
                        // or reinterpret_cast<const char*>(+u8"zß水🍌")
                        // or "\\x7a\\xc3\\x9f\\xe6\\xb0\\xb4\\xf0\\x9f\\x8d\\x8c"

     std::ofstream("text.txt") << data;

     // using system-supplied locale's codecvt facet
     std::wifstream fin("text.txt");
     // reading from wifstream will use codecvt<wchar_t, char, std::mbstate_t>
     // this locale's codecvt converts UTF-8 to UCS4 (on systems such as Linux)
     fin.imbue(std::locale("en_US.UTF-8"));
     std::cout << "The UTF-8 file contains the following UCS4 code units:\\n" << std::hex;
     for (wchar_t c; fin >> c;)
         std::cout << "U+" << std::setw(4) << std::setfill('0')
                   << static_cast<uint32_t>(c) << ' ';

     // using standard (locale-independent) codecvt facet
     std::wstring_convert<
         deletable_facet<std::codecvt<char16_t, char, std::mbstate_t>>, char16_t> conv16;
     std::u16string str16 = conv16.from_bytes(data);

     std::cout << "\\n\\nThe UTF-8 file contains the following UTF-16 code units:\\n"
               << std::hex;
     for (char16_t c : str16)
         std::cout << "U+" << std::setw(4) << std::setfill('0')
                   << static_cast<uint16_t>(c) << ' ';
     std::cout << '\\n';
 }

.SH Output:

 The UTF-8 file contains the following UCS4 code units:
 U+007a U+00df U+6c34 U+1f34c

 The UTF-8 file contains the following UTF-16 code units:
 U+007a U+00df U+6c34 U+d83c U+df4c

.SH See also

  Character       locale-defined multibyte                   UTF-8                       UTF-16
 conversions          (UTF-8, GB18030)
                                                codecvt<char16_t,char,mbstate_t>
   UTF-16     mbrtoc16 / c16rtomb (with C11's   codecvt_utf8_utf16<char16_t>     N/A
              DR488)                            codecvt_utf8_utf16<char32_t>
                                                codecvt_utf8_utf16<wchar_t>
    UCS-2     c16rtomb (without C11's DR488)    codecvt_utf8<char16_t>           codecvt_utf16<char16_t>
   UTF-32     mbrtoc32 / c32rtomb               codecvt<char32_t,char,mbstate_t> codecvt_utf16<char32_t>
                                                codecvt_utf8<char32_t>
   system
  wchar_t:
              mbsrtowcs / wcsrtombs
   UTF-32     use_facet<codecvt                 codecvt_utf8<wchar_t>            codecvt_utf16<wchar_t>
(non-Windows) <wchar_t,char,mbstate_t>>(locale)
    UCS-2
  (Windows)

   codecvt_base          defines character conversion errors
                         \fI(class)\fP
                         represents the system-supplied std::codecvt for the named
   codecvt_byname        locale
                         \fI(class template)\fP
   codecvt_utf8
   \fI(C++11)\fP               converts between UTF-8 and UCS-2/UCS-4
   (deprecated in C++17) \fI(class template)\fP
   (removed in C++26)
   codecvt_utf16
   \fI(C++11)\fP               converts between UTF-16 and UCS-2/UCS-4
   (deprecated in C++17) \fI(class template)\fP
   (removed in C++26)
   codecvt_utf8_utf16
   \fI(C++11)\fP               converts between UTF-8 and UTF-16
   (deprecated in C++17) \fI(class template)\fP
   (removed in C++26)
